We consider an environment where players are involved in a public goods game and must decide repeatedly
whether to make an individual contribution or not. However, players lack strategically relevant information
about the game and about the other players in the population. The resulting behavior of players is completely
uncoupled from such information, and the individual strategy adjustment dynamics are driven only by reinforcement
feedbacks from each player’s own past. We show that the resulting “directional learning” is sufficient to
explain cooperative deviations away from the Nash equilibrium. We introduce the concept of k-strong equilibria,
which nest both the Nash equilibrium and the Aumann-strong equilibrium as two special cases, and we show
that, together with the parameters of the learning model, the maximal k-strength of equilibrium determines the
stationary distribution. The provisioning of public goods can be secured even under adverse conditions, as long
as players are sufficiently responsive to the changes in their own payoffs and adjust their actions accordingly.
Substantial levels of public cooperation can thus be explained without arguments involving selflessness or social
preferences, solely on the basis of uncoordinated directional (mis)learning.


Cooperation in sizable groups has been identified as one of the pillars of our remarkable
evolutionary success. While between-group conflicts and the necessity for alloparental care
are often cited as the likely sources of the other-regarding abilities of the genus Homo, it
is still debated what made us the “supercooperators” that we are today. Research in the realm
of evolutionary game theory has identified a number of different mechanisms by means of which
cooperation might be promoted, ranging from different types of reciprocity and group selection
to positive interactions, risk of collective failure, and static network structure.

The public goods game, in particular, is established as an archetypical context that succinctly
captures the social dilemma that may result from a conflict between group interest and
individual interests. In its simplest form, the game requires that players decide whether to
contribute to a common pool or not. Regardless of the strategy chosen, each player receives an
equal share of the public good, which results from total contributions being multiplied by a
fixed rate of return. For typical rates of return, the individual temptation is to free-ride on
the contributions of the other players, while it is in the interest of the collective for
everyone to contribute. Without additional mechanisms such as punishment, contribution decisions
in such situations approach the free-riding Nash equilibrium [21] over time and thus lead to a
“tragedy of the commons” [22]. Nevertheless, there is rich experimental evidence that
contributions are sensitive to the rate of return and to positive interactions, and there is
evidence that social preferences and beliefs about other players’ decisions are at the heart of
individual decisions in public goods environments.

In this paper, however, we shall consider an environment where players have no strategically
relevant information about the game and/or about other players, and hence explanations in terms
of social preferences and beliefs are not germane. Instead, we shall propose a simple learning
model, where players may mutually reinforce learning off the equilibrium path. As we will show,
this phenomenon provides an alternative and simple explanation for why contributions rise with
the rate of return, as well as why, even under adverse conditions, public cooperation may still
prevail. Previous explanations of this experimental regularity are based on individual-level
costs of ‘error’ [25, 26].

Suppose each player knows neither who the other players are, nor what they earn, nor how many
there are, nor what they do, nor what they did, nor what the rate of return of the underlying
public goods game is. Players do not even know whether the underlying rate of return stays
constant over time (even though in reality it does) because their own payoffs are changing due
to the strategy adjustments of other players, about which they have no information. Without any
such knowledge, players are unable to determine ex ante whether contributing or not contributing
is the better strategy in any given period, i.e., players have no strategically relevant
information about how to respond best. As a result, the behavior of players has to be completely
uncoupled, and their strategy adjustment dynamics are likely to follow a form of reinforcement
feedback or, as we shall call it, directional learning. We note that, in our model, due to the
one-dimensionality of the strategy space, reinforcement and directional learning are both
adequate terminologies for our learning model. Since reinforcement applies also to general
strategy spaces and is therefore more general, we will prefer the terminology of directional
learning. Indeed, such directional learning behavior has been observed in recent public goods
experiments. The important question is how well the population will learn to play the public
goods game despite the lack of strategically relevant information. Note that “well” here has two
meanings due to the conflict between private and collective interests: on the one hand, how
close will the population get to playing the Nash equilibrium, and, on the other hand, how close
will the population get to playing the socially desirable outcome.

The learning model considered in this paper is based on a particularly simple “directional
learning” algorithm which we shall now explain. Suppose each player plays both cooperation
(contributing to the common pool) and defection (not contributing) with a mixed strategy and
updates the weights for the two strategies based on their relative performances in previous
rounds of the game. In particular, a player will increase its weight on contributing if a
previous-round switch from not contributing to contributing led to a higher realized payoff or
if a previous-round switch from contributing to not contributing led to a lower realized payoff.
Similarly, a player will decrease its weight on contributing if a previous-round switch from
contributing to not contributing led to a higher realized payoff or if a previous-round switch
from not contributing to contributing led to a lower realized payoff. For simplicity, we assume
that players make these adjustments at a fixed incremental step size $\delta$, even though this
could easily be generalized. In essence, each player adjusts its mixed strategy directionally
depending on a Markovian performance assessment of whether a previous-round contribution
increase/decrease led to a higher/lower payoff.
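
The four cases above reduce to one observation: the direction of adjustment is the sign of the product of the last change in action and the last change in realized payoff. The following Python sketch illustrates this under assumptions made here for illustration only; the function name `adjustment_direction` and the encoding of actions as 0 (not contributing) and 1 (contributing) are not part of the model specification.

```python
def adjustment_direction(prev_action, action, prev_payoff, payoff):
    """Return +1 (raise the weight on contributing), -1 (lower it), or 0.

    Actions are encoded as 1 (contribute) and 0 (do not contribute);
    this encoding and the function name are illustrative assumptions.
    """
    d_action = action - prev_action      # +1: switched to contributing, -1: switched away
    d_payoff = payoff - prev_payoff      # change in realized payoff
    product = d_action * d_payoff
    if product > 0:
        return 1    # e.g. switched to contributing and earned more
    if product < 0:
        return -1   # e.g. switched to contributing and earned less
    return 0        # no switch or no payoff change: no directional signal
```

A zero return value corresponds to the case in which the previous round carries no directional information, which the model treats separately.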

Since the mixed strategy weights represent a well-ordered strategy set, the resulting model is
related to the directional learning/aspiration adjustment models, and similar models have
previously been proposed for bid adjustments in assignment games, as well as in two-player
games. In the former, the dynamic leads to stable cooperative outcomes that maximize total
payoffs, while Nash equilibria are reached in the latter. The crucial difference between these
previous studies and our present study is that our model involves more than two players in a
voluntary contributions setting, and, as a result, that there can be interdependent directional
adjustments of groups of players including more than one but not all the players. This can lead
to uncoordinated (mis)learning of subpopulations in the game.

Consider the following example. Suppose all players in a large standard public goods game do not
contribute to start with. Then suppose that the players in a subpopulation decide, uncoordinatedly
but by chance simultaneously, to contribute. If this group is sufficiently large (the size of
which depends on the rate of return), then this will result in higher payoffs for all players
including the contributors, despite the fact that not contributing is the dominant strategy in
terms of unilateral replies. In our model, if indeed this generates higher payoffs for all
players including the freshly-turned contributors, then the freshly-turned contributors would
continue to increase their probability to contribute and thus increase the probability to
trigger a form of stampede or herding effect, which may thus lead away from the Nash equilibrium
and towards a socially more beneficial outcome.
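
How large is "sufficiently large"? The sketch below works this out, assuming the standard linear payoff set up in the Results section: a non-contributor keeps 1 unit, and every player receives r/n times the total number of contributions. Starting from full defection, where everyone earns 1, a group of m simultaneous contributors each earns m·r/n, which exceeds 1 exactly when m > n/r. The function name `min_beneficial_group` and the requirement of a strict payoff improvement are assumptions made here for illustration.

```python
import math

def min_beneficial_group(n, r):
    """Smallest number m of simultaneous contributors who each earn strictly
    more than under full defection, or None if no group of size <= n does.

    Assumes the linear payoff u_i = (1 - c_i) + (r / n) * sum_j c_j, so a
    contributor in a group of m earns m * r / n versus 1 under full defection.
    """
    if n <= 0 or r <= 0:
        raise ValueError("n and r must be positive")
    m = math.floor(n / r) + 1    # smallest integer strictly greater than n / r
    return m if m <= n else None
```

For n = 16, the required group shrinks from 9 players at r = 2 to 3 players at r = 8, in line with the statement that the critical size depends on the rate of return. For r ≤ 1 no group qualifies.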

Our model of uncoordinated but mutually reinforcing deviations away from Nash provides an
alternative explanation for the following regularity that has been noted in experiments on
public goods provision. Namely, aggregate contribution levels are higher the higher the rate of
return, despite the fact that the Nash equilibrium remains unchanged (at no contribution). This
regularity has previously been explained only at an individual level, namely that ‘errors’ are
less costly, and therefore more likely, the higher the rate of return, following
quantal-response equilibrium arguments. By contrast, we provide a group-dynamic argument. Note
that the alternative explanation in terms of individual costs is not germane in our setting,
because we have assumed that players have no information to make such assessments. It is in this
sense that our explanation complements the explanation in terms of costs.

In what follows, we present the results, where we first set up 
the model and then deliver our main conclusions. We discuss 
the implications of our results in section 3. Further details 
about the applied methodology are provided in the Methods 
section. 


Results 

Public goods game with directional learning 

In the public goods game, each player $i$ in the population $N = \{1, 2, \ldots, n\}$ chooses
whether to contribute ($c_i = 1$) or not to contribute ($c_i = 0$) to the common pool. Given a
fixed rate of return $r > 0$, the resulting payoff of player $i$ is then
$u_i = (1 - c_i) + (r/n) \sum_{j \in N} c_j$. We shall call $r/n$ the game’s marginal per-capita
rate of return and denote it as $R$. Note that for simplicity, but without loss of generality,
we have assumed that the group is the whole population. In the absence of restrictions on the
interaction range of players, i.e., in well-mixed populations, the size of the groups and their
formation can be shown to be of no relevance in our case, as long as $R$ rather than $r$ is
considered as the effective rate of return.
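
As a minimal illustration of the payoff function just defined, the following Python sketch evaluates it for a small group. The function name `payoffs` and the representation of the contribution profile as a list of 0/1 integers are assumptions made here for illustration.

```python
def payoffs(contributions, r):
    """Payoff u_i = (1 - c_i) + (r / n) * sum_j c_j for every player i.

    `contributions` is a list of 0/1 choices, one per player; r is the
    rate of return, so R = r / n is the marginal per-capita rate of return.
    """
    n = len(contributions)
    R = r / n
    total = sum(contributions)
    return [(1 - c) + R * total for c in contributions]


# Four players, r = 2 (so R = 0.5): the social dilemma in numbers.
print(payoffs([0, 0, 0, 0], 2))   # full defection: everyone earns 1.0
print(payoffs([1, 1, 1, 1], 2))   # full cooperation: everyone earns 2.0
print(payoffs([0, 1, 1, 1], 2))   # a lone free-rider earns 2.5, contributors 1.5
```

With R = 0.5 a unilateral switch to defection always gains 1 - R = 0.5, yet full cooperation pays twice as much as full defection.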

The directional learning dynamics is implemented as follows. Suppose the above game is
infinitely repeated at time steps $t = 0, 1, 2, \ldots$, and suppose further that $i$, at time
$t$, plays $c_i^t = 1$ with probability $p_i^t \in [\delta, 1 - \delta]$ and $c_i^t = 0$ with
probability $(1 - p_i^t)$. Let the vector of contribution probabilities $p^t$ describe the state
of the game at time $t$. We initiate the game with all $p_i^0$ lying on the $\delta$-grid
between 0 and 1, while subsequently individual mixed strategies evolve randomly subject to the
following three “directional bias” rules:

upward: if $u_i(c_i^t) > u_i(c_i^{t-1})$ and $c_i^t > c_i^{t-1}$, or if
$u_i(c_i^t) < u_i(c_i^{t-1})$ and $c_i^t < c_i^{t-1}$, then $p_i^{t+1} = p_i^t + \delta$ if
$p_i^t < 1$; otherwise, $p_i^{t+1} = p_i^t$.

neutral: if $u_i(c_i^t) = u_i(c_i^{t-1})$ and/or $c_i^t = c_i^{t-1}$, then $p_i^{t+1} = p_i^t$,
$p_i^t + \delta$, or $p_i^t - \delta$ with equal probability if $0 < p_i^t < 1$; otherwise,
$p_i^{t+1} = p_i^t$.

downward: if $u_i(c_i^t) > u_i(c_i^{t-1})$ and $c_i^t < c_i^{t-1}$, or if
$u_i(c_i^t) < u_i(c_i^{t-1})$ and $c_i^t > c_i^{t-1}$, then $p_i^{t+1} = p_i^t - \delta$ if
$p_i^t > 0$; otherwise, $p_i^{t+1} = p_i^t$.

Note that the second, neutral rule above allows random deviations from any intermediate
probability $0 < p_i < 1$. However, $p_i = 0$ and $p_i = 1$ for all $i$ are absorbing state
candidates. We therefore introduce perturbations to this directional learning dynamics and study
the resulting stationary states. In particular, we consider perturbations of order $\epsilon$
such that, with probability $1 - \epsilon$, the dynamics is governed by the original three
“directional bias” rules. However, with probability $\epsilon$, either $p_i^{t+1} = p_i^t$,
$p_i^{t+1} = p_i^t - \delta$, or $p_i^{t+1} = p_i^t + \delta$ happens equally likely (with
probability $\epsilon/3$) but of course obeying the $p_i^{t+1} \in [0, 1]$ restriction.
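
The perturbed dynamics can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the simulation code behind the reported results: the function name `simulate`, its default parameter values, the start from full defection with a fictitious preceding round of full defection, and the storage of probabilities as integer positions on the $\delta$-grid (clamped to [0, 1]) are all choices made here.

```python
import random

def simulate(n=16, r=8.0, delta=0.1, eps=0.1, steps=2000, seed=0):
    """Return the fraction of contributors in each round of the perturbed dynamics."""
    rng = random.Random(seed)
    levels = round(1 / delta)     # number of delta-steps between p = 0 and p = 1
    grid = [0] * n                # p_i = grid[i] / levels; assumed start: p_i = 0
    prev_c = [0] * n              # assumed fictitious previous round: nobody contributes
    prev_u = [1.0] * n            # ... in which every player earned 1
    history = []
    for _ in range(steps):
        # Each player contributes with probability p_i.
        c = [1 if rng.random() < g / levels else 0 for g in grid]
        total = sum(c)
        u = [(1 - ci) + (r / n) * total for ci in c]
        for i in range(n):
            if rng.random() < eps:
                # Perturbation: stay, step down, or step up, each with probability eps / 3.
                move = rng.choice((-1, 0, 1))
            else:
                signal = (c[i] - prev_c[i]) * (u[i] - prev_u[i])
                if signal > 0:
                    move = 1      # upward rule
                elif signal < 0:
                    move = -1     # downward rule
                elif 0 < grid[i] < levels:
                    move = rng.choice((-1, 0, 1))   # neutral rule at interior p_i
                else:
                    move = 0      # neutral rule at p_i = 0 or p_i = 1
            grid[i] = min(levels, max(0, grid[i] + move))   # keep p_i in [0, 1]
        prev_c, prev_u = c, u
        history.append(total / n)
    return history
```

Calling `simulate(eps=0.0)` from this initial condition returns a history of zeros, since full defection is absorbing without perturbations. Any `eps > 0` allows the population to leave that state.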


Provisioning of public goods 

We begin with a formal definition of the k-strong equilibrium. In particular, a pure strategy
imputation $s^*$ is a k-strong equilibrium of our (symmetric) public goods game if, for all
$C \subseteq N$ with $|C| \le k$, $u_i(s^*_C; s^*_{N \setminus C}) \ge u_i(s'_C; s^*_{N \setminus C})$
for some $i \in C$ for any alternative pure strategy set $s'_C$ for $C$. As noted in the previous
section, this definition bridges, on the one hand, the concept of the Nash equilibrium in pure
strategies, in the sense that any k-strong equilibrium with $k > 0$ is also a Nash equilibrium,
and, on the other hand, that of the (Aumann-)strong equilibrium, in the sense that any k-strong
equilibrium with $k = n$ is Aumann-strong. Equilibria in between (for $1 < k < n$) are “more
stable” than a Nash equilibrium, but “less stable” than an Aumann-strong equilibrium.

The maximal k-strengths of the equilibria in our public goods game as a function of $r$ are
depicted in Fig. 1 for $n = 16$. The cyan-shaded region indicates the “public bad game” region
for $r < 1$ ($R < 1/n$), where the individual and the public motives in terms of the Nash
equilibrium of the game are aligned towards defection. Here $c_i = 0$ for all $i$ is the unique
Aumann-strong equilibrium, or in terms of the definition of the k-strong equilibrium, $c_i = 0$
for all $i$ is k-strong for all $k \in [1, n]$. The magenta-shaded region indicates the typical
public goods game for $1 < r < n$ ($1/n < R < 1$), where individual and public motives are
conflicting. Here there exist no Aumann-strong equilibria. The outcome $c_i = 0$ for all $i$ is
the unique Nash equilibrium, and that outcome is also a k-strong equilibrium for some
$k \in [1, n)$, where the size of $k$ depends on $r$ and $n$ in that $\partial k/\partial r < 0$
while $\partial k/\partial n > 0$. Finally, the gray-shaded region indicates the unconflicted
public goods game for $r > n$ ($R > 1$), where individual and public motives are again aligned,
but this time towards cooperation. Here $c_i = 1$ for all $i$ abruptly becomes the unique Nash
and Aumann-strong equilibrium, or equivalently the unique k-strong equilibrium for all
$k \in [1, n]$.

If we add perturbations of order $\epsilon$ to the unperturbed public goods game with
directional learning that we have introduced in section 2, there exist stationary distributions
of $p_i$ and the following proposition can be proven. In the following, we denote by “$k$” the
maximal k-strength of an equilibrium.

Proposition: As $t \to \infty$, starting at any $p^0$, the expectation with respect to the
stationary distribution is $E[p^t] > 1/2$ if $R > 1$ and $E[p^t] < 1/2$ if $R < 1$.
$\partial E[p^t]/\partial \epsilon < 0$ if $R > 1$, and $\partial E[p^t]/\partial \epsilon > 0$
if $R < 1$. Moreover, $\partial E[p^t]/\partial \delta > 0$ if $R < 1$, and
$\partial E[p^t]/\partial \delta < 0$ if $R > 1$. Finally, $\partial E[p^t]/\partial k < 0$ if
$R < 1$.


FIG. 1: The maximal k-strength of equilibria in the studied public goods game with directional
learning. As an example, we consider the population size being $n = 16$. As the rate of return
$r$ increases above 1, the Aumann-strong ($n$-strong) $c_i = 0$ for all $i$ (full defection)
equilibrium loses strength. It is still the unique Nash equilibrium, but its maximal strength is
bounded by $k = 17 - r$. As the rate of return $r$ increases further above $n$ ($R > 1$), the
$c_i = 1$ for all $i$ (full cooperation) equilibrium suddenly becomes Aumann-strong
($n$-strong). Shaded regions denote the public bad game ($r < 1$), and the public goods games
with conflicting ($1 < r < n$) and aligned ($R > 1$) individual and public motives in terms of
the Nash equilibrium of the game (see main text for details). We note that results for other
population and/or group sizes are the same over $R$, while $r$ and the slope of the red line of
course scale accordingly.


We begin the proof by noting that the perturbed process given by our dynamics results in an
irreducible and aperiodic Markov chain, which has a unique stationary distribution. When
$\epsilon = 0$, any absorbing state must have $p_i^t = 0$ or $1$ for all players. This is clear
from the positive probability paths to either extreme from intermediate states given by the
unperturbed dynamics. We shall now analyze whether $p_i^t = 0$ or $1$, given that $p_j^t = 0$ or
$1$ for all $j \neq i$, has a larger attraction given the model’s underlying parameters.

If $R > 1$, the probability path for any player to move from $p_i^t = 0$ to $p_i^{t+T} = 1$ in
some $T = 1/\delta$ steps requires a single perturbation for that player and is therefore of the
order of a single $\epsilon$. By contrast, the probability for any player to move from
$p_i^t = 1$ to $p_i^{t+T} = 0$ in $T$ steps is of the order $\epsilon^3$, because at least two
other players must increase their contribution in order for that player to experience a payoff
increase from his non-contribution. Along any other path, or if $p^t$ is such that there are not
two players $j$ with $p_j^t = 0$ to make this move, the probability for $i$ to move from
$p_i^t = 1$ to $p_i^{t+T} = 0$ in $T$ steps requires even more perturbations and is of higher
order. Notice that, for any one player to move from $p_i^t = 0$ to $p_i^{t+T} = 1$, we need at
least two players to move away from $p_i^t = 0$ along the least-resistance paths. Because
contributing 1 is a best reply for all $R > 1$, those two players will also continue to increase
if continuing to contribute 1. Notice that the length of the path is $T = 1/\delta$ steps, and
that the path requires no perturbations along the way, which is less likely the smaller
$\delta$.

FIG. 2: Color-encoded average contribution levels in the perturbed public goods game with
directional learning. Simulations confirm that, with little directional learning sensitivity
(i.e., when $\delta$ is zero or very small), for the marginal per-capita rate of return $R > 1$
the outcome $c_i = 1$ for all $i$ is the unique Nash and Aumann-strong equilibrium. For $R = 1$
(dashed horizontal line), any outcome is a Nash equilibrium, but only $c_i = 1$ for all $i$ is
Aumann-strong while all other outcomes are only Nash equilibria. For $R < 1$, $c_i = 0$ for all
$i$ is the unique Nash equilibrium, and its maximal k-strength depends on the population size.
This is in agreement with results presented in Fig. 1. Importantly, however, as the
responsiveness of players increases, contributions to the common pool become significant even in
the defection-prone $R < 1$ region. In effect, individuals (mis)learn what is best for them and
end up contributing even though this would not be a unilateral best reply. Similarly, in the
$R > 1$ region free-riding starts to spread despite the fact that it is obviously better to
cooperate. For both these rather surprising and counterintuitive outcomes to emerge, the only
thing needed is directional learning.

If $R < 1$, the probability for any player to move from $p_i^t = 1$ to $p_i^{t+T} = 0$ in some
$T = 1/\delta$ steps requires a single perturbation for that player and is therefore of the
order of a single $\epsilon$. By contrast, the probability for any player to move from
$p_i^t = 0$ to $p_i^{t+T} = 1$ in some $T$ steps is at least of the order $\epsilon^k$, because
at least $k$ players (corresponding to the maximal k-strength of the equilibrium) must
contribute in order for all of these players to experience a payoff increase. Notice that $k$
decreases in $R$. Again, the length of the path is $T = 1/\delta$ steps, and that path requires
no perturbations along the way, which is less likely the smaller $\delta$. With this, we
conclude the proof of the proposition. However, it is also worth noting a direct corollary of
the proposition; namely, as $\epsilon \to 0$, $E[p^t] \to 1$ if $R > 1$, and $E[p^t] \to 0$ if
$R < 1$.

Lastly, we simulate the perturbed public goods game with directional learning and determine the
actual average contribution levels in the stationary state. Color-encoded results in dependence
on the normalized rate of return $R$ and the responsiveness of players to the success of their
past actions $\delta$ (alternatively, the sensitivity of the individual learning process) are
presented in Fig. 2 for $\epsilon = 0.1$. Small values of $\delta$ lead to a close convergence
to the respective Nash equilibrium of the game, regardless of the value of $R$. As the value of
$\delta$ increases, the pure Nash equilibria erode and give way to a mixed outcome. It is
important to emphasize that this is in agreement with, or rather is in fact a consequence of,
the low k-strengths of the non-contribution pure equilibria (see Fig. 1). Within intermediate to
large $\delta$ values the Nash equilibria are implemented in a zonal rather than pinpoint way.
When the Nash equilibrium is such that all players contribute ($R > 1$), then small values of
$\delta$ lead to more efficient aggregate play (recall any such equilibrium is $n$-strong).
Conversely, by the same logic, when the Nash equilibrium is characterized by universal
free-riding, then larger values of $\delta$ lead to more efficient aggregate play. Moreover, the
precision of implementation also depends on the rate of return in the sense that uncoordinated
deviations of groups of players lead to more efficient outcomes the higher the rate of return.
In other words, the free-riding problem is mitigated if group deviations lead to higher payoffs
for every member of an uncoordinated deviation group, the minimum size of which (that in turn is
related to the maximal k-strength of equilibrium) is decreasing with the rate of return.

Simulations also confirm that the evolutionary outcome is qualitatively invariant to: i) the
value of $\epsilon$, as long as the latter is bounded away from zero, although longer
convergence times are an inevitable consequence of very small $\epsilon$ values (see Fig. 3);
ii) the replication of the population (i.e., making the whole population a group) and the random
remixing between groups; and iii) the population size, although here again the convergence times
are shorter the smaller the population size. While both ii and iii are a direct consequence of
the fact that we have considered the public goods game in a well-mixed rather than a structured
population (where players would have a limited interaction range and where pattern formation
could thus play a decisive role), the qualitative invariance to the value of $\epsilon$ is
elucidated further in Fig. 3. By “qualitative invariance” we mean that, regardless of the value
of $\epsilon > 0$, the population always diverges away from the Nash equilibrium towards a
stable mixed stationary state. But as can be observed in Fig. 3, the average contribution level
and its variance both increase slightly as $\epsilon$ increases. This is reasonable if one
considers $\epsilon$ as an exploration or mutation rate. More precisely, the lower the value of
$\epsilon$, the longer it takes for the population to move away from the Nash equilibrium where
everybody contributes zero in the case that $1/n < R < 1$ (which was also the initial condition,
for clarity). However, as soon as initial deviations (from $p_i = 0$ in this case) emerge (with
probability proportional to $\epsilon$), the neutral rule in the original learning dynamics
takes over, and this drives the population towards a stable mixed stationary state. Importantly,
even if the value of $\epsilon$ is extremely small, the random drift sooner or later gains
momentum and eventually yields contribution levels similar to those attainable with larger
values of $\epsilon$. Most importantly, note that there is a discontinuous jump towards staying
in the Nash equilibrium, which occurs only if $\epsilon$ is exactly zero. If $\epsilon$ is
bounded away from zero, then the free-riding Nash equilibrium erodes unless it is $n$-strong
(for very low values of $R < 1/n$).

FIG. 3: Time evolution of average contribution levels, as obtained for $R = 0.7$,
$\delta = 0.1$ and different values of $\epsilon$ (see legend). If only $\epsilon > 0$, the Nash
equilibrium erodes to a stationary state where at least some members of the population always
contribute to the common pool. There is a discontinuous transition to complete free-riding
(defection) as $\epsilon \to 0$. Understandably, the lower the value of $\epsilon$ (the smaller
the probability for the perturbation), the longer it may take for the drift to gain momentum and
for the initial deviation to evolve towards the mixed stationary state. Note that the horizontal
(time) axis is on a logarithmic scale.


Discussion

We have introduced a public goods game with directional learning, and we have studied how the
level of contributions to the common pool depends on the rate of return and the responsiveness
of individuals to the successes and failures of their own past actions. We have shown that
directional learning alone suffices to explain deviations from the Nash equilibrium in the
stationary state of the public goods game. Even though players have no strategically relevant
information about the game and/or about each others’ actions, the population could still end up
in a mixed stationary state where some players contributed at least part of the time although
the Nash equilibrium would be full free-riding. Vice versa, defectors emerged where cooperation
was clearly the best strategy to play. We have explained these evolutionary outcomes by
introducing the concept of k-strong equilibria, which bridge the gap between Nash and
Aumann-strong equilibria. We have demonstrated that the lower the maximal k-strength and the
higher the responsiveness of individuals to the consequences of their own past strategy choices,
the more likely it is for the population to (mis)learn what is the objectively optimal
unilateral (Nash) play.

These results have some rather exciting implications. Foremost, the fact that the provisioning
of public goods even under adverse conditions can be explained without any sophisticated and
often lengthy arguments involving selflessness or social preference holds promise of significant
simplifications of the rationale behind seemingly irrational individual behavior in sizable
groups. It is simply enough for a critical number (depending on the size of the group and the
rate of return) of individuals to make a “wrong choice” at the same time once, and if only the
learning process is sufficiently fast or naive, the whole subpopulation is likely to adopt this
wrong choice as their own at least part of the time. In many real-world situations, where the
rationality of decision making is often compromised due to stress, propaganda or peer pressure,
such “wrong choices” are likely to proliferate. As we have shown in the context of public goods
games, sometimes this means more prosocial behavior, but it can also mean more free-riding,
depending only on the rate of return.

The power of directional (mis)learning to stabilize unilaterally suboptimal game play of course
takes nothing away from the more traditional and established explanations, but it does bring to
the table an interesting option that might be appealing in many real-life situations, also those
that extend beyond the provisioning of public goods. Fashion trends or viral tweets and videos
might all share a component of directional learning before acquiring mainstream success and
recognition. We hope that our study will be inspirational for further research in this
direction. The consideration of directional learning in structured populations, for example,
appears to be a particularly exciting future venture.


Methods

For the characterization of the stationary states, we introduce the concept of k-strong
equilibria, which nests both the Nash equilibrium and the Aumann-strong equilibrium [39, 40] as
two special cases. While the Nash equilibrium describes the robustness of an outcome against
unilateral (1-person) deviations, the Aumann-strong equilibrium describes the robustness of an
outcome against the deviations of any subgroup of the population. An equilibrium is said to be
(Aumann-)strong if it is robust against deviations of the whole population or indeed of any
conceivable subgroup of the population, which is indeed rare. Our definition of the k-strong
equilibrium bridges the two extreme cases, measuring the size of the group $k \ge 1$ (at or
above Nash) and hence the degree to which an equilibrium is stable. We note that our concept is
related to coalition-proof equilibrium [43, 44]. In the public goods game, the free-riding Nash
equilibrium is typically also more than 1-strong but never $n$-strong. As we will show, the
maximal strength $k$ of an equilibrium translates directly to the level of contributions in the
stationary distribution of our process, which is additionally determined by the normalized rate
of return $R$ and the responsiveness of players to the success of their past actions $\delta$,
i.e., the sensitivity of the individual learning process.